🐿️ Scour
⚡ SIMD Vectorization

AVX Instructions, Parallel Data Processing, Compiler Optimization, Performance

Basic facts about GPUs
damek.github.io·3d·
Discuss: Lobsters, Hacker News, r/programming
🖥️Hardware Architecture
Introducing ZMatrix: High-Performance Tensor Operations for PHP
dev.to·14h·
Discuss: DEV
🚀SIMD Parsing
Meet Mojo: The Language That Could Replace Python, C++, and CUDA
hackernoon.com·6h
⬆️Lambda Lifting
Machine Learning Fundamentals: accuracy with python
dev.to·22h·
Discuss: DEV
👁️Observatory Systems
OBK-RCM: Accelerated Orthogonal Block Kaczmarz Algorithm via RCM Reordering and Dynamic Grouping for Sparse Linear Systems
arxiv.org·1d
📐Linear Algebra
Speculative Optimizations for WebAssembly using Deopts and Inlining
v8.dev·1d·
Discuss: Hacker News, r/javascript, r/webdev
🦀Rust Macros
Ask HN: Feedback on "QSS" – A Quantized Vector Search Engine in C
news.ycombinator.com·1d·
Discuss: Hacker News
🗂️Vector Databases
Speed Up Python Loops: Proven Techniques To Make Your Code Faster
thenewstack.io·1d
🧮Compute Optimization
Scaling Pinterest ML Infrastructure with Ray: From Training to End-to-End ML Pipelines
medium.com·22h·
Discuss: Hacker News
🧮Z3 Applications
Overtuning in Hyperparameter Optimization
arxiv.org·10h
🧠Machine Learning
Introduction to CUDA Programming With GPU Puzzles
henryhmko.github.io·2d
🔩Systems Programming
Rust SIMD Programming: Accelerate Performance with Vectorized Instructions and Parallel Processing
dev.to·5d·
Discuss: DEV
🚀SIMD Parsing
VeriLocc: End-to-End Cross-Architecture Register Allocation via LLM
arxiv.org·1d
🏭Compiler Backends
Greedy Is Good. Less Greedy May Be Better
gojiberries.io·13h·
Discuss: Hacker News
🧮Kolmogorov Complexity
The Interactive Handbook on Data Structures and Algorithms
cartesian.app·1d·
Discuss: Lobsters, Hacker News
🌳Trie Structures
AMD researchers reduce graphics card VRAM capacity of 3D-rendered trees from 38GB to just 52 KB with work graphs and mesh nodes — shifting CPU work to the GPU y...
tomshardware.com·3h
🖥️Modern Terminals
Black-Box Test Code Fault Localization Driven by Large Language Models and Execution Estimation
arxiv.org·10h
🔍Concolic Testing
How much slower is random access, really?
samestep.com·1d·
Discuss: Hacker News
📼Tape Encoding
Fully lifted blirp interpolation -- a large deviation view
arxiv.org·10h
🌀Fractal Compression
Programming, Not Prompting: A Hands-On Guide to DSPy
towardsdatascience.com·1d
🧮Z3 Solver